A comparison of LPC and FFT-based acoustic features for noise robust ASR

Authors

  • Febe de Wet
  • Bert Cranen
  • Johan de Veth
  • Lou Boves
Abstract

Within the context of robust acoustic features for automatic speech recognition (ASR), we evaluated mel-frequency cepstral coefficients (MFCCs) derived from two spectral representation techniques, i.e. the fast Fourier transform (FFT) and linear predictive coding (LPC). ASR systems based on the two feature types were tested on a digit recognition task using continuous density hidden Markov phone models. System performance was determined in clean acoustic conditions as well as in different simulations of adverse acoustic conditions. The LPC-based MFCCs outperformed their FFT counterparts in most of the adverse acoustic conditions that were investigated in this study. A tentative explanation for this difference in recognition performance is given.
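
As an illustration of the two feature types the abstract compares, the following is a minimal sketch of one plausible reading: the same mel filterbank and cosine transform are applied either to an FFT power spectrum or to an LPC spectral envelope. It is not the authors' implementation. The sampling rate, frame length, LPC order, number of filters, number of cepstral coefficients, and all function names are assumptions chosen for the example, and the input frame is synthetic.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale (assumed design)."""
    n_bins = n_fft // 2 + 1
    mel_points = np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    bank = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        lo, mid, hi = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def fft_power_spectrum(frame, n_fft):
    """Power spectrum taken directly from the FFT of the windowed frame."""
    return np.abs(np.fft.rfft(frame, n_fft)) ** 2

def lpc_coefficients(frame, order):
    """Autocorrelation method, solved with the Levinson-Durbin recursion."""
    n = len(frame)
    r = np.correlate(frame, frame, mode="full")[n - 1 : n + order]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1 : 0 : -1])
        k = -acc / err                      # reflection coefficient
        a_prev = a.copy()
        a[1:i] = a_prev[1:i] + k * a_prev[i - 1 : 0 : -1]
        a[i] = k
        err *= 1.0 - k * k                  # updated prediction error
    return a, err

def lpc_power_spectrum(frame, order, n_fft):
    """All-pole spectral envelope: err / |A(e^{jw})|^2."""
    a, err = lpc_coefficients(frame, order)
    return err / np.abs(np.fft.rfft(a, n_fft)) ** 2

def mfcc_from_spectrum(power_spectrum, bank, n_ceps):
    """Log mel filterbank energies followed by an unnormalized DCT-II."""
    log_energies = np.log(bank @ power_spectrum + 1e-10)
    n = len(log_energies)
    k = np.arange(n_ceps)[:, None]
    m = np.arange(n)[None, :]
    dct_basis = np.cos(np.pi * k * (m + 0.5) / n)
    return dct_basis @ log_energies

# Synthetic 25 ms frame at an assumed 8 kHz sampling rate (not real speech).
sample_rate, n_fft, frame_len = 8000, 512, 200
rng = np.random.default_rng(0)
t = np.arange(frame_len) / sample_rate
frame = (np.sin(2 * np.pi * 500 * t) + 0.5 * np.sin(2 * np.pi * 1500 * t)
         + 0.01 * rng.standard_normal(frame_len))
frame = frame * np.hamming(frame_len)

bank = mel_filterbank(20, n_fft, sample_rate)
mfcc_fft = mfcc_from_spectrum(fft_power_spectrum(frame, n_fft), bank, 13)
mfcc_lpc = mfcc_from_spectrum(lpc_power_spectrum(frame, 10, n_fft), bank, 13)
```

The only difference between the two feature vectors in this sketch is the spectral estimate fed to the filterbank: the LPC path replaces the raw FFT spectrum with a smooth all-pole envelope before mel integration.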

Similar resources

Comparing acoustic features for robust ASR in fixed and cellular network applications

Within the context of automatic speech recognition (ASR) applications for telephony, we investigate the acoustic pre-processing issues that are at stake in going from the fixed line to the cellular network. Because the spectral representation used in enhanced full rate GSM is linear prediction, we investigate the relative advantages and drawbacks of conventional mel-frequency cepstral coefficient (...

Improving the performance of MFCC for Persian robust speech recognition

Mel-frequency cepstral coefficients are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...

Robust energy demodulation based on continuous models with application to speech recognition

In this paper, we develop improved schemes for simultaneous speech interpolation and demodulation based on continuous-time models. This leads to robust algorithms to estimate the instantaneous amplitudes and frequencies of the speech resonances and extract novel acoustic features for ASR. The continuous-time models retain the excellent time resolution of the ESAs based on discrete energy operato...

Uncertainty Decoding with Adaptive Sampling for Noise Robust DNN-Based Acoustic Modeling

Although deep neural network (DNN) based acoustic models have obtained remarkable results, automatic speech recognition (ASR) performance still remains low in noisy and reverberant conditions. To address this issue, a speech enhancement front-end is often used before recognition to reduce noise. However, the front-end cannot fully suppress noise and often introduces artifacts that are limit...

Speech recognition system robust to noise and speaking styles

It is difficult to recognize speech distorted by various factors, especially when an ASR system contains only a single acoustic model. One solution is to use multiple acoustic models, one model for each different condition. In this paper, we discuss a parallel decoding-based ASR system that is robust to the noise type, SNR, speaker gender and speaking style. Our system consists of two recogniti...

Publication date: 2001